AITopics | dangerous behavior

903ceb0ed2d5ceec6e2c9b317b6c54a8-Paper-Conference.pdf

Neural Information Processing Systems | Jun-19-2026, 18:57:06 GMT

Recent advances in Large Vision-Language Models (LVLMs) have showcased strong reasoning abilities across multiple modalities, achieving significant breakthroughs in various real-world applications. Despite this great success, the safety guardrail of LVLMs may not cover the unforeseen domains introduced by the visual modality. Existing studies primarily focus on eliciting LVLMs to generate harmful responses via carefully crafted image-based jailbreaks designed to bypass alignment defenses. In this study, we reveal that a safe image can be exploited to achieve the same jailbreak consequence when combined with additional safe images and prompts. This stems from two fundamental properties of LVLMs: universal reasoning capabilities and safety snowball effect. Building on these insights, we propose Safety Snowball Agent (SSA), a novel agent-based framework leveraging agents' autonomous and tool-using abilities to jailbreak LVLMs. SSA operates through two principal stages: (1) initial response generation, where tools generate or retrieve jailbreak images based on potential harmful intents, and (2) harmful snowballing, where refined subsequent prompts induce progressively harmful outputs. Our experiments demonstrate that SSA can use nearly any image to induce LVLMs to produce unsafe content, achieving high success jailbreaking rates against the latest LVLMs. Unlike prior works that exploit alignment flaws, SSA leverages the inherent properties of LVLMs, presenting a profound challenge for enforcing safety in generative multimodal systems.

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Safe + Safe = Unsafe? Exploring How Safe Images Can Be Exploited to Jailbreak Large Vision-Language Models

Cui, Chenhang, Deng, Gelei, Zhang, An, Zheng, Jingnan, Li, Yicong, Gao, Lianli, Zhang, Tianwei, Chua, Tat-Seng

arXiv.org Artificial Intelligence | Nov-27-2024

Recent advances in Large Vision-Language Models (LVLMs) have showcased strong reasoning abilities across multiple modalities, achieving significant breakthroughs in various real-world applications. Despite this great success, the safety guardrail of LVLMs may not cover the unforeseen domains introduced by the visual modality. Existing studies primarily focus on eliciting LVLMs to generate harmful responses via carefully crafted image-based jailbreaks designed to bypass alignment defenses. In this study, we reveal that a safe image can be exploited to achieve the same jailbreak consequence when combined with additional safe images and prompts. This stems from two fundamental properties of LVLMs: universal reasoning capabilities and safety snowball effect. Building on these insights, we propose Safety Snowball Agent (SSA), a novel agent-based framework leveraging agents' autonomous and tool-using abilities to jailbreak LVLMs. SSA operates through two principal stages: (1) initial response generation, where tools generate or retrieve jailbreak images based on potential harmful intents, and (2) harmful snowballing, where refined subsequent prompts induce progressively harmful outputs. Our experiments demonstrate that SSA can use nearly any image to induce LVLMs to produce unsafe content, achieving high success jailbreaking rates against the latest LVLMs. Unlike prior works that exploit alignment flaws, SSA leverages the inherent properties of LVLMs, presenting a profound challenge for enforcing safety in generative multimodal systems. Our code is available at https://github.com/gzcch/Safety_Snowball_Agent.

large language model, lvlm, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2411.11496

Country:

North America > United States (1.00)
Asia > Russia (1.00)
Europe > Switzerland > Zürich > Zürich (0.14)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Music (1.00)
Materials > Chemicals (1.00)
Leisure & Entertainment (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety

Zhang, Zaibin, Zhang, Yongting, Li, Lijun, Gao, Hongzhi, Wang, Lijun, Lu, Huchuan, Zhao, Feng, Qiao, Yu, Shao, Jing

arXiv.org Artificial Intelligence | Jan-22-2024

Multi-agent systems, augmented with Large Language Models (LLMs), demonstrate significant capabilities for collective intelligence. However, the potential misuse of this intelligence for malicious purposes presents significant risks. To date, comprehensive research on the safety issues associated with multi-agent systems remains limited. From the perspective of agent psychology, we discover that the dark psychological states of agents can lead to severe safety issues. To address these issues, we propose a comprehensive framework grounded in agent psychology. In our framework, we focus on three aspects: identifying how dark personality traits in agents might lead to risky behaviors, designing defense strategies to mitigate these risks, and evaluating the safety of multi-agent systems from both psychological and behavioral perspectives. Our experiments reveal several intriguing phenomena, such as the collective dangerous behaviors among agents, agents' propensity for self-reflection when engaging in dangerous behavior, and the correlation between agents' psychological assessments and their dangerous behaviors. We anticipate that our framework and observations will provide valuable insights for further research into the safety of multi-agent systems. We will make our data and code publicly accessible at https://github.com/AI4Good24/PsySafe.

agent, dangerous behavior, multi-agent system, (14 more...)

arXiv.org Artificial Intelligence

2401.1188

Country:

North America > United States > New York (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)

Can AI Video Analytics Ever Really Be Intelligent?

#artificialintelligence | Nov-22-2019, 22:22:27 GMT

Video surveillance is commonly associated with security. But in most cases, it's used to record incidents and assist in investigations after the fact rather than to prevent undesirable events. Artificial intelligence-powered video analytics is a highly promising trend that fundamentally changes the way things work. Extracting manageable data from a video stream can help recognize risky situations early on, minimizing damage and, ideally, avoiding emergencies entirely. At the same time, AI significantly expands the areas of application of video surveillance beyond security systems.

neural network, surveillance, video surveillance, (10 more...)

#artificialintelligence

Industry: Commercial Services & Supplies > Security & Alarm Services (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.36)
Information Technology > Artificial Intelligence > Vision (0.30)

Artificial Intelligence to Make Petrol Pumps Safer by Picking Out Dangerous Behavior

#artificialintelligence | Mar-17-2019, 07:37:41 GMT

Artificial intelligence (AI), machine learning (ML) and continual deep learning (DL) are the new-age digital skills expected to transform the consumer and enterprise experience. Due to the vast amount of data now available on the Internet, machine learning and deep learning have the capability to predict and prevent various catastrophically dangerous events. Now, Shell wants to leverage artificial intelligence to make petrol pumps a safer place. Shell has selected C3 IoT and Microsoft Azure to power a new companywide AI platform. A device inside petrol pumps running Microsoft Azure IoT Edge can use artificial intelligence tools to pick out dangerous behavior like people lighting cigarettes while waiting at the pump, people driving recklessly, theft, and improper fueling.

ai platform, artificial intelligence, machine learning, (6 more...)

#artificialintelligence

Country: Asia (0.18)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Twitter found to block certain words in search engine

Daily Mail - Science & tech | Mar-29-2017, 03:14:01 GMT

Twitter has quietly started blocking certain words on the platform's built-in search engine. Words such as 'porn', 'nsfw', 'sex' and similar terms will no longer appear when searched under the 'Latest' tab, but racial slurs and the word 'jihad' have not been removed. Although Twitter has blocked these words from being found in the Latest tab, users can still find some of the 'forbidden' terms by searching in the 'Top' tab. Twitter says it 'prohibits the promotion of hate content, sensitive topics, and violence globally.' But this policy does not apply to news and information that calls attention to hate, sensitive topics, or violence but does not advocate for it.

artificial intelligence, information retrieval, natural language, (17 more...)

Daily Mail - Science & tech

Country:

Asia > Middle East > Republic of Türkiye (0.33)
Europe > Germany (0.06)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.82)
